
The A/B Testing Masterclass

A complete guide to the statistical framework behind experimentation, followed by a real-world case study and an interactive tool to run your own tests.

Part 1: The Statistical Framework

Modern A/B testing is only reliable when experiments are designed with clear statistical guardrails. We don't just "run a test"; we design a Power Analysis up front to ensure the experiment is large enough to detect the effect we actually care about.

How Experiment Errors Are Defined

| Reality \ Decision | Detect Effect | No Effect Detected |
|---|---|---|
| Effect Exists | True Positive (Power) | False Negative (β) |
| No Effect Exists | False Positive (α) | True Negative |

The Core Formula:

$$ n = \frac{2 \cdot (Z_{1-\alpha/2} + Z_{1-\beta})^2 \cdot p(1-p)}{(p_2 - p_1)^2} $$

Where $n$ is the minimum sample size required per variant to detect a difference between $p_1$ (Control) and $p_2$ (Test), $p = (p_1 + p_2)/2$ is the pooled proportion, and $Z_{1-\alpha/2}$ and $Z_{1-\beta}$ are the standard-normal quantiles for the chosen significance level and target power.
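
To make the formula concrete, here is a minimal Python sketch of the same calculation. The function name `required_sample_size` is mine, and the Z-values come from `scipy.stats.norm`; treat it as a sketch of the formula above, not the calculator used later on this page.

```python
# A minimal sketch of the sample-size formula, assuming a two-sided alpha and the
# pooled proportion p = (p1 + p2) / 2. The function name is illustrative.
import math
from scipy.stats import norm

def required_sample_size(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Minimum users per variant to detect a change from p1 (control) to p2 (test)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # e.g. 1.96 for alpha = 0.05
    z_beta = norm.ppf(power)            # e.g. 0.84 for 80% power
    p_bar = (p1 + p2) / 2               # pooled proportion used in the numerator
    n = 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2
    return math.ceil(n)                 # always round up to the next whole user
```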

Part 2: Real-Life Case Study

A marketing team wants to test whether a new checkout design improves conversion rate. Before launching, they need to know: "How many users do we need?"

The Assumptions

- Baseline conversion rate ($p_1$): 5%
- Minimum detectable effect: +1 percentage point, so $p_2$ = 6%
- Significance level ($\alpha$): 5%, two-sided ($Z_{1-\alpha/2} = 1.96$)
- Statistical power ($1-\beta$): 80% ($Z_{1-\beta} = 0.84$)

The Calculation

$$ n = \frac{2 \cdot (1.96 + 0.84)^2 \cdot 0.055 \cdot 0.945}{(0.01)^2} \approx 8,148 $$

Conclusion: We need roughly 8,150 users per variant. If we launch with fewer, the test is underpowered: we are unlikely to detect a real +1 pp lift, and any "significant" result we do see is more likely to be noise or an exaggerated estimate.
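
As a quick sanity check, the same arithmetic in a few lines of Python (variable names are mine, not the page's calculator code):

```python
# Reproducing the case-study arithmetic with the assumptions listed above.
import math

z_alpha, z_beta = 1.96, 0.84     # alpha = 5% two-sided, power = 80%
p1, p2 = 0.05, 0.06              # 5% baseline, +1 percentage point target
p_bar = (p1 + p2) / 2            # pooled proportion, 0.055
n = 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2
print(math.ceil(n))              # 8150; full-precision Z-values give closer to 8,159
```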

Part 3: Interactive Calculator

Adjust the parameters below and click "Run Analysis". If your sample size is lower than required, the result will be flagged.

[Interactive calculator: the "Design Your Test (Power Analysis)" panel reports the Required Sample Size (Per Variant), 8,148 at the default settings above; the "Analyze Results (Post-Test)" panel reports the Observed Lift and the Statistical Inference for the numbers you enter.]
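
Under the hood, the post-test step boils down to a two-proportion z-test. Below is a minimal sketch of that analysis; the function name `analyze_results`, the normal-approximation p-value, and the example numbers are all illustrative, not the page's actual calculator code.

```python
# A sketch of the "Analyze Results (Post-Test)" step: a two-proportion z-test.
import math

def analyze_results(control_visitors, control_conversions, test_visitors, test_conversions):
    """Return observed lift, z-score, and two-sided p-value (normal approximation)."""
    p1 = control_conversions / control_visitors
    p2 = test_conversions / test_visitors
    lift = p2 - p1                                            # observed absolute lift
    p_pool = (control_conversions + test_conversions) / (control_visitors + test_visitors)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_visitors + 1 / test_visitors))
    z = lift / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return lift, z, p_value

# Hypothetical numbers in the neighbourhood of the case study:
lift, z, p = analyze_results(8150, 408, 8150, 489)
print(f"Observed lift: {lift:+.2%}, z = {z:.2f}, p = {p:.4f}")
```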
Key Takeaway: Never run a test without checking the "Required Sample Size" first. If your "Control Visitors" count is lower than the "Required Sample Size", your "Statistically Significant" result might just be luck.
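
To see why, here is a rough simulation under stated assumptions: a real +1 pp effect exists, but the test runs with only 1,000 users per variant instead of the ~8,150 required. The effect is detected only a small fraction of the time, and the lifts that do cross the significance bar are, on average, far larger than the true +1 pp, so luck is doing much of the work.

```python
# Illustrative simulation: an underpowered test with a real but small effect.
import math
import random

random.seed(7)
n, true_p1, true_p2 = 1_000, 0.05, 0.06      # assumed: 1,000 users/variant vs the ~8,150 required
trials, sig_lifts = 3_000, []

for _ in range(trials):
    c1 = sum(random.random() < true_p1 for _ in range(n))   # control conversions
    c2 = sum(random.random() < true_p2 for _ in range(n))   # test conversions
    lift = c2 / n - c1 / n
    p_pool = (c1 + c2) / (2 * n)
    se = math.sqrt(2 * p_pool * (1 - p_pool) / n)
    if se > 0 and abs(lift) / se > 1.96:                     # "significant" at alpha = 0.05
        sig_lifts.append(lift)

print(f"Power at n = 1,000: {len(sig_lifts) / trials:.0%}")                  # well below the 80% target
print(f"Average 'significant' lift: {sum(sig_lifts) / len(sig_lifts):.1%}")  # inflated vs the true +1 pp
```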